AITopics | Augusta


Augusta

Unsupervised decoding of encoded reasoning using language model interpretability

arXiv.org Artificial Intelligence | Dec-9-2025

As large language models become increasingly capable, there is growing concern that they may develop reasoning processes that are encoded or hidden from human oversight. To investigate whether current interpretability techniques can penetrate such encoded reasoning, we construct a controlled testbed by fine-tuning a reasoning model (DeepSeek-R1-Distill-Llama-70B) to perform chain-of-thought reasoning in ROT-13 encryption while maintaining intelligible English outputs. We evaluate mechanistic interpretability methods--in particular, logit lens analysis--on their ability to decode the model's hidden reasoning process using only internal activations. We show that logit lens can effectively translate encoded reasoning, with accuracy peaking in intermediate-to-late layers. Finally, we develop a fully unsupervised decoding pipeline that combines logit lens with automated paraphrasing, achieving substantial accuracy in reconstructing complete reasoning transcripts from internal model representations. These findings suggest that current mechanistic interpretability techniques may be more robust to simple forms of encoded reasoning than previously understood. Our work provides an initial framework for evaluating interpretability methods against models that reason in non-human-readable formats, contributing to the broader challenge of maintaining oversight over increasingly capable AI systems.
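To make the two ingredients named in the abstract concrete, here is a minimal sketch, not the paper's pipeline. It shows ROT-13 via Python's standard `codecs` module and a toy logit-lens readout that projects a hidden vector through an unembedding matrix and picks the top-scoring token. The vocabulary, matrix, hidden vectors, and the helper name `logit_lens_readout` are all hypothetical and chosen for illustration; a real logit lens would typically apply the model's own final normalization and unembedding to intermediate-layer activations.

```python
import codecs


def rot13(text: str) -> str:
    """Apply ROT-13: rotate each ASCII letter 13 places; other characters pass through."""
    return codecs.encode(text, "rot_13")


def logit_lens_readout(hidden, unembed, vocab):
    """Toy logit lens: score each vocabulary token as the dot product of the
    hidden vector with that token's unembedding row, then return the top token."""
    logits = [sum(h * w for h, w in zip(hidden, row)) for row in unembed]
    best = max(range(len(vocab)), key=lambda i: logits[i])
    return vocab[best]


# Hypothetical 3-token vocabulary and 2-dimensional hidden states (illustration only).
vocab = ["the", "gur", "cat"]
unembed = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

encoded = rot13("the")  # "gur": the ROT-13 form the fine-tuned model would emit

# Assumed activations: a late layer aligned with the encoded token,
# and an intermediate layer aligned with the plain-English token.
surface = logit_lens_readout([0.1, 0.9], unembed, vocab)
intermediate = logit_lens_readout([0.8, 0.2], unembed, vocab)

print(encoded, surface, intermediate)
```

ROT-13 is its own inverse, so `rot13(rot13(s)) == s` for any string `s`. In this toy setup the intermediate readout recovers the English token while the final readout yields its encoded form, which is the kind of layer-wise contrast the abstract describes.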

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2512.01222

Country:

North America > United States > Illinois > Sangamon County > Springfield (0.14)
North America > United States > Illinois > Cook County > Chicago (0.07)
North America > United States > California > Sacramento County > Sacramento (0.05)
(22 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Leisure & Entertainment (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.89)
(2 more...)


Rise Of The Robot Bees: Tiny Drones Turned Into Artificial Pollinators

NPR Technology | Mar-3-2017, 13:15:04 GMT

An artist's illustration shows how a remote-controlled drone might one day be used to pollinate flowers. Courtesy of Dr. Eijiro Miyako. Near Esparto, in the beautiful Capay Valley region of central California, 1,400 young almond trees flourish in a century-old orchard overlooking the hills. Since November, they've stood in perfect rows without a hint of foliage -- resting, naked and dormant, for the upcoming growing season. Their branches now swell with bright pastel blooms in preparation for pollination. Like most almond growers, Brian Paddock, owner of Capay Hills Orchard, relies on bees to provide this important aspect of crop development.

artificial intelligence, miyako, pollination, (14 more...)

NPR Technology

Country:

North America > United States > Minnesota (0.05)
North America > United States > Maine > Kennebec County > Augusta (0.05)
North America > United States > California > Riverside County > Riverside (0.05)
Asia > Japan (0.05)

Industry: Food & Agriculture > Agriculture (0.89)

Technology:

Information Technology > Artificial Intelligence > Robots (0.89)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.40)


Teaching Machines to Learn on Their Own

AITopics Original Links | Jan-24-2017, 11:09:24 GMT

Steve Mirsky: Welcome to Scientific American's Science Talk, posted on November 10, 2015. A short episode today, for which I'll turn it over now to Scientific American's associate tech editor, Larry Greenemeier. Larry Greenemeier: Computers have always been good at doing things that are really complicated for us humans. On the other hand, computers have a really hard time recognizing a particular voice or face in a crowd, something most kids learn to do before they're even out of diapers. But things are changing fast. Over the next decade or so, machines will more easily mimic inherently human abilities.

artificial intelligence, computer, machine learning, (14 more...)

AITopics Original Links

Country:

North America > United States > New York (0.05)
North America > United States > Maine > Kennebec County > Augusta (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
(2 more...)

Technology:

Information Technology > Communications > Mobile (0.50)
Information Technology > Artificial Intelligence > Machine Learning (0.37)
